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Many materials science phenomena, such as growth and self-organisation, are dominated by ac- 
tivated diffusion processes and occur on timescales that are well beyond the reach of standard- 
molecular dynamics simulations. Kinetic Monte Carlo (KMC) schemes make it possible to overcome 
this limitation and achieve experimental timescales. However, most KMC approaches proceed by 
discretizing the problem in space in order to identify, from the outset, a fixed set of barriers that are 
used throughout the simulations, limiting the range of problems that can be addressed. Here, we 
propose a more flexible approach — the kinetic activation-relaxation technique (k-ART) — which 
lifts these constraints. Our method is based on an off-lattice, self-learning, on-the-fly identification 
and evaluation of activation barriers using ART and a topological description of events. The validity 
and power of the method are demonstrated through the study of vacancy diffusion in crystalline 
silicon. 

PACS numbers: 02.70.-c, 61.72.Cc, 81.10.Aj, 61.72.jd 



Many problems in condensed matter and materials sci- 
ence involve stochastic processes associated with the dif- 
fusion of atoms over barriers that are high with respect 
to temperature and therefore inherently slow under "nor- 
mal" conditions. Because the associated rates are small, 
these processes may be considered independent; neglect- 
ing the thermal motion of atoms, it is thus possible to 
deal with them using the kinetic Monte Carlo (KMC) 
algorithm, a stochastic approach proposed by Bortz et 
al. [U [3j [4] and based on transition state theory, 
whereby the evolution of a system is determined by a set 
of pre-specified diffusion mechanisms, i.e., whose energy 
barriers are known beforehand. In KMC simulations, 
the timescale is determined by the fastest activated pro- 
cesses and, in practice, timescales of ms or longer can 
be reached — much longer than accessible in traditional 
molecular-dynamics (MD) simulations. 

While KMC has been extensively and successfully used 
over the past 20 years, it suffers from a number of draw- 
backs. In particular, the systems investigated must be 
discretized and mapped onto a fixed lattice in order to 
define the various diffusion mechanisms that need to be 
considered at a given moment [3] . Once all processes on 
the lattice have been identified (and their barriers evalu- 
ated) a priori, the simulations simply consist in operating 
a diffusion event picked at random, updating the list of 
possible moves in the new configuration, and iterating 
this procedure long enough to cover the relevant physi- 
cal timescales. This approach works very well for simple 
problems (e.g., surface diffusion, metal-on- metal growth) 
but fails when the systems undergo significant lattice de- 
formations or when long-range elastic effects are impor- 
tant. There have been numerous efforts to lift these limi- 
tations, most solutions falling into one of two categories: 
introduction of continuum approximations for the long- 



range strain deformations, and on-the-fly evaluation of 
the energy barriers. The first category retains the lattice 
formulation but adds long-range contributions — which 
can be computed through various extrapolation schemes 
- to the barriers [5j [6] . With the second class of solu- 
tions, there is no need to set-up a catalog of all possi- 
ble activation mechanisms. In a recently proposed self- 
learning KMC approach, Trushin et al. introduced an 
on-the-fly search for barriers but displacements were re- 
stricted to be on-lattice[7 . In other cases, a limited num- 
ber of activated events using the the ART-like dimer [5J[9] 
or eigenvector-following methods [10 are generated at 
each step in order to construct a small catalog which 
serves to determine the next move. Thus, these meth- 
ods |6j [7] are still limited by the lattice description of 
the problem and the approximate character of the elastic 
energies. On-the-fly /off-lattice approaches [51 19} [TO] , on 
the other hand, while more flexible, are currently inef- 
ficient as they do not take advantage of the knowledge 
of previously-encountered events, and are therefore only 
useful for small systems with very few barriers. 

In this Letter, we introduce a powerful on-the-fly /off- 
lattice KMC method which achieves speed-ups as large 
as 4000 over standard MD for complex systems, while 
retaining a complete description of the relevant physics, 
including long-range elastic interactions. Our approach is 
based on the activation-relaxation technique (ART nou- 
veau) [TTJ [12] for generating events and calculating barri- 
ers; the gain in efficiency is achieved through a topolog- 
ical classification of atomic environments, which allows 
configurations and events to be recognized and stored ef- 
ficiently, and used again as the simulation proceeds, i.e., 
the method is self-learning. We demonstrate the validity 
and efficiency of this kinetic ART (k-ART) approach by 
applying it to the problem of vacancy diffusion in crys- 
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FIG. 1: (Color online) Local topology analysis in k-ART: a 
truncated graph (b) is extracted from the complete lattice 
around the highlighted central atom (a), and analysed using 
nauty (c) which returns a unique key associated with the given 
topology (d). 



talline silicon and comparing to full MD simulations. 

Before describing k-ART, it is useful to discuss the 
topological characterization. For each configuration, a 
connectivity graph formed by the network of local neigh- 
bors is first constructed. These may correspond to cova- 
lently bonded atoms in semiconductors, or faces in the 
Voronoi' tesselation of compact materials. It is impor- 
tant that the configuration be uniquely defined through 
this network, i.e., the connectivity graph must lead to a 
unique structure once relaxed with a given interatomic 
potential. In order to classify the activated processes, a 
truncated graph is constructed around each atom, as il- 
lustrated Fig. [I] the size of which depends on the physics 
of the system under study. In the case of Si, for example, 
we define the local environnement around an atom by a 
sphere of radius 5.0 A, which includes about 40 atoms; 
two neighbours are bonded if their distance is less than 
2.8 A. An event is defined as a change in the topology 
of the local graph. This classification is performed using 
the freely available topological software nauty, developed 
by McKay [13 , which provides the topology index and all 
information necessary for uniquely identifying each envi- 
ronment, including the permutation key needed to recon- 
struct a specific geometry from the generic topology and 
a set of reference positions. 

Events which have been learned are stored for subse- 
quent use; in practice, the atomic positions of the initial 
state as well as the associated topologies for the initial, 
transition and final truncated graphs are saved in mem- 
ory. If needed, the transition and final state configura- 
tions may be reconstructed from the reference geometry 
through a series of symmetry operations extracted from 
the topological analysis. This results in a considerable 
reduction in the amount of data that needs to be gen- 
erated and manipulated. For a single vacancy in c-Si, 
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FIG. 2: (a) Total squared displacement as function of KMC 
steps (or time, in the inset), (b) Distance between the two 
vacancies as a function of KMC steps. 



for example, only 20 different topologies are necessary to 
describe all possible local environments, irrespective of 
the system size. Moreover, as the system evolves and 
previously-encountered topologies are recognized, it is 
only necessary to update the table of active events, the 
cost of which is negligible as we will see below. 

We now turn to a detailed description of the k-ART 
algorithm. Starting from an initial relaxed configura- 
tion, the various local topologies are characterized with 
nauty and, for each topology, possible events are con- 
structed with ART nouveau [TTJ [12], which has been 
shown to efficiently identify the relevant diffusion mech- 
anism in a wide range of systems, either crystalline or 
amorphous, with both empirical and ab initio meth- 
ods [TH [I5j [I6j [17] . Within this approach, the configura- 
tion is slowly pushed along a randomly selected direction 
until an unstable direction appears in the Hessian; this is 
followed while minimising the energy in the perpendicu- 
lar hyperplane until the system converges onto a saddle 
point and the system is then pushed over the barrier and 
relaxed into a new minimum. Since activated processes 
involve only a finite number of atoms, each event is initi- 
ated by displacing a given atom and its neighbours within 
a small, local region in a random direction. The exact 
size of the displacement regions depends on the system 
under study; in semiconductors, they typically involve 
first and second nearest-neighbours. The initial conver- 
gence criterion for the saddle point search is set to 1.0 
eV/A in order to accelerate convergence (but see below). 

To simplify labelling, each event is assigned to the 
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topology centered on the atom that moved the most dur- 
ing the event, irrespective of the initial trial assignment. 
The events are stored as displacement vectors from the 
reference state to the transition and the final states; these 
are used to reconstruct all specific events associated with 
a given topology throughout the lattice. Once the list 
of topologies and associated barriers is set (or has been 
updated), all low-energy events (which we define for Si 
as having barriers of 15 ksT or less) are reconstructed 
from the topology and re-relaxed with a stricter con- 
vergence criterion of 0.1 eV/A in order to accurately 
take into account the local environment and the long- 
range interactions, leading to a precision of about 0.01 
eV on the barrier height. At this point, two types of 
events are in the catalog: (1) "generic" events, that in- 
clude all high-energy barriers, and (2) "specific" events, 
where all low-energy barriers, dominating the kinetics, 
are relaxed individually. We associate a transition rate 
Ti = ro exp (AEi/ksT) to each event, where To is fixed at 
the outset and, for simplificity, assumed to be the same 
(= 10 13 8 _1 ) for all events. From this list, and follow- 
ing Bortz et al. PQ, the elapsed time to the next event is 
computed as At = — In \ij n, with /i a random num- 
ber in the [0, 1[ interval. Finally, an event is selected 
with a weight proportional to its rate and is operated; 
the clock is pushed forward and the process starts again: 
The topology of all atoms belonging to the local envi- 
ronment around the new state is constructed; if a new 
topology is found, a series of ART nouveau searches are 
launched; otherwise, we proceed to the next step. After 
all events are updated, the low-lying barriers are, once 
again, relaxed before calculating the time increment and 
selecting the next move. 

As the system evolves, it may get trapped in a set of 
local configurations separated by very low energy bar- 
riers that dominate the dynamics without yielding dif- 
fusion. An exact solution to dealing with such "flick- 
ers" has been proposed by Athenes et al. [18 , but we 
elect here to use a simpler limited-memory Tabu-like ap- 
proach [19] which proceeds by banning transitions rather 
than states [20]. In brief, at any given moment, we keep 
in memory (the "memory kernel" ) the n previous transi- 
tions. If a planned transition is already in memory, it is 
blocked and the initial or the final configuration of this 
move is adopted with the appropriate Boltzmann proba- 
bility; the transition is also blocked for the next n jumps 
and removed from the list of possible events. As was 
shown in Ref. [20], this approach is thermodynamically 
exact and is kinetically valid as long as the memory is 
short compared to the timeline of evolution of the sys- 
tem. 

We now demonstrate the validity and efficiency of our 
method by studying the diffusion of systems of two and 
six vacancies in a 1000-atom Stillinger- Weber c-Si sam- 
ple. For the 2- vacancy system, we start by removing two 
second-neighbour Si atoms, then perform a k-ART run 




KMC steps 

FIG. 3: (Color online) Total squared displacement as function 
of KMC steps for the 6- vacancy system. 

for 200 CPU hours on a single 1.5 GHz Itanium 2 pro- 
cessor. During this time, the vacancies perform about 
1000 jumps, corresponding to a diffusion time of about 
100 /is. Figure [2] (a) shows the total squared displace- 
ment of the atoms as a function of KMC steps (i.e., 
events) and, in the inset, effective time. Two types of 
behaviour are clearly visible. During the first 400 steps 
(~ 4 /is), diffusion takes place through correlated single 
vacancy hops, the two vacancies maintaining a separa- 
tion oscillating between 3.85 and 4.5 A. At about step 
400, the two vacancies become trapped as a single diva- 
cancy, characterized by small local rearrangements, and 
remain so for about 80 /is before partially breaking apart 
and resuming its 2-vacancy correlated walk. The corre- 
lated motion is best seen in Fig. [2] (b), where we plot 
the distance between the two vacancies as a function of 
KMC steps : the two vacancies remain bound in a first or 
second-neighbour state for the whole simulation, except 
for occasional excursions to larger distances. This strik- 
ing result illustrates perfectly the strength of k-ART in 
handling fully the impact of lattice deformation on dif- 
fusion, which here induces the two vacancies to return 
rapidly to a tightly-correlated configuration. 

For the 6-vacancy problem, now, we start with a con- 
figuration containing two 3-vacancy clusters placed far 
away from each other, as shown in the t = snapshot in 
Fig. [3] This configuration is challenging because the dy- 
namics is dominated by a series of local rearrangements 
and reorientations associated with low-energy barriers 
that preserve the compactness of the cluster, i.e., break- 
ing it appart is very difficult. To test this, we first ran a 
30 ns MD simulation at 500 K; no dissociation or diffu- 
sion events were observed. Likewise, nothing happened 
in a 5000-step k-ART simulation without memory ker- 
nel, which covered 8 /is. These two calculations required 
roughly the same computational effort; we thus already 
conclude that k-ART is at least 250 times faster than 
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MD; this is fast, but we can do much better by invoking 
the memory kernel to eliminate the flicker problem which 
is inherent to such complex materials. 

Thus, we carried out a third simulation of the 6- 
vacancy problem using k-ART and the memory kernel. 
Figure [3] shows the squared displacement as function of 
KMC steps. We observe, in agreement with the previous 
two simulations, that the initial state is fairly stable: the 
system flickers during the first 20 /as (160 KMC steps), 
in agreement with the MD and the k-ART simulation 
without memory kernel. At 20 /is, one vacancy breaks 
away from the top right cluster and quickly moves to the 
other cluster, forming a 4- vacancy chain and leaving a 
divacancy behind. As in the 2-vacancy simulation, this 
divacancy diffuses through the box for about 45 /is (525 
KMC steps) as a correlated pair. Finally, at event 685 
(65 /is), the divacancy breaks appart and one vacancy 
rapidly joins the larger cluster. The lone monovacancy 
diffuses through the lattice during 25 /is and eventually 
joins the 5-vacancy cluster, forming a stable hexavacancy 
chain with a total energy 2 eV lower than that of the 
initial configuration. Not surprisingly, for the last 150 
KMC steps (20 /is), the diffusion becomes negligible and 
we observe only unsuccessful attempts to dissociate the 
hexavacancy cluster. 

In terms of efficiency, k-ART with the memory kernel is 
about 4000 times faster than MD, with 110 /is simulated 
in 220 CPU hours. In k-ART most of the computational 
time is spent in identifying events associated with new 
topologies. This is clear in Figure [4] where we plot CPU 
time versus simulated time for k-ART, with the learning 
phases indicated by arrows. Sampling is considerable: at 
the end of this run, 17237 different events were generated, 
associated with 1964 initial topologies, for an average of 
almost 9 events per topology; at each step during the 
simulation, the system presents about 80 to 120 different 
topologies. Since each atom is associated with a topology, 
about 9000 different barriers are considered at each KMC 
step. 

While the cost of k-ART is significantly higher than 
lattice KMC, it is ideally suited to problems that have 
been left aside until now because of the importance of off- 
lattice positions, surface reconstruction and long-range 
elastic effects, as we have shown in the study of 2 and 6 
Si vacancy diffusion. Because the method is inherently 
local, a number of improvements can be envisaged that 
will yield considerable acceleration to the code. Paral- 
lelizing the management of events and barriers, for ex- 
ample, should speed up the calculations by a factor of 
10 or 20. Moreover, because of the topological classifica- 
tion, the catalog of events may be stored and reused at 
a later time, thus accelerating new simulations. Kinetic 
ART is an exciting self-learning, off-lattice kinetic Monte- 
Carlo algorithm that opens the door to the numerical 
study of problems such as semiconductor growth, self- 
organization, defect diffusion and interface mixing, that 
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FIG. 4: (Color online) CPU time versus simulated time for 
the hexavacancy aggregation problems. Red arrows indicate 
extensive self-learning phases in k-ART 



have until now been out of the reach of simulations. 

This work has been supported by the Canada Research 
Chairs program and by grants from the Natural Sciences 
and Engineering Research Council of Canada (NSERC) 
and the Fonds Quebecois de la Recherche sur la Nature et 
les Technologies (FQRNT). We are grateful to the Reseau 
Quebecois de Calcul de Haute Performance (RQCHP) for 
generous allocations of computer resources. 



Electronic address: f.el.mellouhi@umontreal.ca 

* Electronic address: normand.mousseau@umontreal.ca 

* Electronic address: laurent.lewis@umontreal.ca 

[1] A. B. Bortz and M. H. Kalos, J. Comput. Phys. 178, 10 
(1975). 

[2] K. Fichthorn and W. Weinberg, J Chem Phys 95, 1090 
(1991). 

[3] A. F. Voter, F. Montalenti, and T. C. Germann, Annu. 

Rev. Mater. Res. 34, 321 (2002). 
[4] A. F. Voter, Radiation Effects in Solids, ed. by K.E. 

Sickafus, E.A. Kotomin, and B.P. Uberuaga (Springer, 

NATO Publishing Unit, Dordrecht, The Netherlands, 

2007), vol. 235, chap. Introduction to the Kinetic Monte 

Carlo Method, pp. 1-24. 
[5] D. Mason, R. Rudd, and A. Sutton, J Phys-Condens Mat 

16, S2679 (2004). 
[6] T. Sinno, J Cryst Growth 303, 5 (2007). 
[7] O. Trushin, A. Karim, A. Kara, and T.S. Rahman, Phys 

Rev B 72, 115401 (2005). 
[8] G. Henkelman and J. Jonsson, J. Chem. Phys. Ill, 7010 

(1999). 

[9] F. Hontinfinde, A. Rapallo, and R. Ferrando, Surf Sci 
600, 995 (2006). 
[10] T. Middleton and D. Wales, J Chem Phys 120, 8134 
(2004). 

[11] G. T. Barkema and N. Mousseau, Phys. Rev. Lett. 77, 
4358 (1996). 

[12] R. Malek and N. Mousseau, Phys. Rev. E 62, 7723 



5 



(2000). 

[13] B. D. McKay, Congressus Numerantium 30, 45 (1981). 
[14] Y. Song, R. Malek, and N. Mousseau, Phys. Rev. B 62, 
15680 (2000). 

[15] F. Valiquette and N. Mousseau, Phys. Rev. B 68, 125209 
(2003). 

[16] F. El-Mellouhi, N. Mousseau, and P. Ordejon, Phys. Rev. 

B 70, 205202 (2004). 
[17] M.-A. Malouin, F. El-Mellouhi, and N. Mousseau, Phys. 

Rev. B 76 (2007). 
[18] M. Athenes, P. Bellon, and G. Martin, Phil Mag A 76, 

565 (1997). 

[19] F. Glover and M. Laguna, Tabu Search (Kluwer Aca- 
demics, Dordrecht, 1997). 

[20] M. V. Chubynsky, H. Vocks, G. T. Barkema, and 
N. Mousseau, J Non-Cryst Solids 352, 4424 (2006). 



